Binary Substructure Descriptors for Organic Compounds
Abstract
Organic chemical structures are represented by binary vectors that encode the presence or absence of 1365 substructures. The guiding ideas for selecting this set of substructures are described, and examples are given. The software SubMat has been developed for fast and flexible computation of binary substructure descriptors from molecular structures. Examples from structure similarity searches demonstrate how well the described set of substructures represents organic chemical structures.
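The representation described in the abstract can be illustrated with a minimal sketch. It is not the SubMat software and does not use the paper's 1365 substructures: the six-entry substructure list, the hand-assigned substructure sets, and the function names are all hypothetical. The Tanimoto coefficient is assumed here as the similarity measure because it is a common choice for binary descriptors; the abstract does not say which measure the authors used.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy dictionary standing in for the paper's 1365 substructures.
SUBSTRUCTURES: List[str] = [
    "benzene ring",
    "carboxylic acid",
    "hydroxyl",
    "primary amine",
    "ester",
    "chlorine",
]


def to_binary_vector(present: Set[str]) -> List[int]:
    """Encode presence (1) or absence (0) of each substructure, in list order."""
    return [1 if name in present else 0 for name in SUBSTRUCTURES]


def tanimoto(a: List[int], b: List[int]) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either vector.

    Assumed similarity measure; returns 0.0 when neither vector has a bit set.
    """
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 0.0


def rank_by_similarity(
    query: List[int], library: Dict[str, List[int]]
) -> List[Tuple[str, float]]:
    """Return library entries sorted from most to least similar to the query."""
    scored = [(name, tanimoto(query, vec)) for name, vec in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Substructure sets assigned by hand for illustration only.
library = {
    "salicylic acid": to_binary_vector({"benzene ring", "carboxylic acid", "hydroxyl"}),
    "aniline": to_binary_vector({"benzene ring", "primary amine"}),
    "ethyl acetate": to_binary_vector({"ester"}),
}
query = to_binary_vector({"benzene ring", "carboxylic acid"})  # benzoic acid

for name, score in rank_by_similarity(query, library):
    print(f"{name}: {score:.2f}")
```

In a real system the substructure sets would come from substructure matching on the molecular graph rather than from hand-written sets, and the vectors would typically be stored as packed bit strings so that large libraries can be searched quickly.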
Similar resources
Quantitative Structure-Activity Relationships Study of Carbonic Anhydrase Inhibitors Using Logistic Regression Model
Binary logistic regression (BLR) has been developed as a non-linear model to establish quantitative structure-activity relationships (QSAR) between structural descriptors and the biochemical activity of carbonic anhydrase inhibitors. Using a training set consisting of 21 compounds with known ki values, the model was trained and tested to solve two-class problems as active or inactive on the basi...
Development of binary classification of structural chromosome aberrations for a diverse set of organic compounds from molecular structure.
Classification models are generated to predict in vitro cytogenetic results for a diverse set of 383 organic compounds. Both k-nearest neighbor and support vector machine models are developed. They are based on calculated molecular structure descriptors. Endpoints used are the labels clastogenic or nonclastogenic according to an in vitro chromosomal aberration assay with Chinese hamster lung ce...
Proteochemometric mapping of the interaction of organic compounds with melanocortin receptor subtypes.
Proteochemometrics was applied in the analysis of the binding of organic compounds to wild-type and chimeric melanocortin receptors. Thirteen chimeric melanocortin receptors were designed based on statistical molecular design; each chimera contained parts from three of the MC(1,3-5) receptors. The binding affinities of 18 compounds were determined for these chimeric melanocortin receptors and t...
Get the best from substructure mining
The chemical information that is present in a set of compounds is rarely fully exploited. This is mostly because no descriptor set can capture all biologically important features. As a result, valuable chemical knowledge can stay hidden from hypothesis-based drug design. The simplest form of a structure-activity relationship (SAR) is a substructure that predisposes compounds towards reduce...
Quantitative Modeling for Prediction of Critical Temperature of Refrigerant Compounds
The quantitative structure-property relationship (QSPR) method is used to develop the correlation between structures of refrigerants (198 compounds) and their critical temperature. Molecular descriptors calculated from structure alone were used to represent molecular structures. A subset of the calculated descriptors selected using a genetic algorithm (GA) was used in the QSPR model development...